1. INTRODUCTION

Energy prediction has become important in recent years [1-3] because of growing research interest in
quantifying energy savings. Energy savings can be quantified only with a good and accurate baseline
energy model. A baseline energy model is a tool that depicts the energy consumption of a particular
system before and after changes are made within the system. A well-developed baseline energy model helps
building owners, energy managers and utility companies to plan, manage and propose steps for efficient
energy management. Such a model is valuable to these stakeholders because it presents the relationship
between the energy consumption and the independent variables in a form suitable for energy management.
Before any baseline energy model is developed, suitable candidates for the independent variables that may
affect the energy consumption have to be identified. Identification of the independent variables depends
strongly on the type of building [4, 5], the measurement option [6, 7], the location [8-10] and the
surrounding climate [11]. The selected independent variables are then fed to the next important
procedure, i.e. modelling and prediction. Modelling and prediction can be the most pivotal part of
achieving a good baseline energy model. Modelling is the process of choosing a suitable model to
represent the relationship between input and output. In this case, the output is the energy consumption
and the inputs are the independent variables. Prediction is the capability of the selected model to
estimate future values that follow the input-output behavior of the system.

Furthermore, quantification of the energy saved after energy conservation measures (ECM) have been
implemented relies heavily on the predicted energy consumption. Energy saved after ECM implementation is
an absence of energy use, so it cannot be measured directly. It is the difference between the energy
consumption measured after the implementation of the ECM and the energy consumption that would have
occurred if no ECM had been implemented. To carry out this calculation, the energy consumption in the
period after ECM implementation has to be predicted under the assumption that no ECM was implemented
during this period. This is because the effect of the independent variables that govern the energy
consumption, such as meteorological conditions, working days and occupancy, has to be separated from the
impact of the ECM. A framework for measuring and quantifying energy savings in this way is available in
the International Performance Measurement and Verification Protocol (IPMVP) [12]. Known for its
deterministic and simple approach, the linear regression model has been adopted in the IPMVP for energy
consumption prediction. The linear regression model is also a preferred choice among researchers in the
field of energy prediction and load forecasting [7, 11, 13, 14].
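
The savings calculation described above can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the function name `avoided_energy`, the sample values and the sign
convention (a positive result means energy saved) are hypothetical and are not taken from the IPMVP or
from the case study.

```python
def avoided_energy(predicted_baseline, measured_post_ecm):
    """Return per-interval and total savings (baseline prediction minus measurement)."""
    if len(predicted_baseline) != len(measured_post_ecm):
        raise ValueError("both series must cover the same intervals")
    per_interval = [p - m for p, m in zip(predicted_baseline, measured_post_ecm)]
    return per_interval, sum(per_interval)


# Hypothetical hourly values in kWh, for illustration only.
baseline = [120.0, 135.0, 150.0]   # predicted as if no ECM had been implemented
measured = [110.0, 128.0, 139.0]   # metered after the ECM implementation
per_hour, total = avoided_energy(baseline, measured)
print(per_hour, total)
```

The accuracy of the result depends entirely on the baseline prediction, which is why the choice of model
matters.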

Despite its acceptance among researchers [7, 11, 13, 14], the linear regression model may be inaccurate
in modelling and prediction because the independent variables and the energy consumption may have a
non-linear relationship. Efforts to improve the accuracy of the linear regression model include
clustering of the independent variables [15] and representing hypotheses for the regression model [16],
but these result in complex models. Developing such models may require a large number of datasets, which
makes them difficult to implement. To avoid these difficulties, a simple model is required that can deal
with the non-linear characteristics of the independent variables, produce accurate results and be trained
with a moderate amount of data. Thus, the intention of this paper is to present a non-linear model for
developing a baseline energy model and performing energy consumption prediction. The Non-linear Auto
Regressive with Exogenous Input (NARX) model will be used as the non-linear model. The NARX model will
use a multilayer perceptron artificial neural network (MLP-ANN) as the model estimator and will be
referred to as the NARX-ANN model. A case study will be conducted in an educational building at a
Malaysian university. The developed baseline energy model and energy consumption prediction will be
compared with those of a Multiple Linear Regression (MLR) model.


2. RESEARCH METHOD 

This section describes the research methodology designed to achieve the desired objectives. The research
starts with the selection of an academic building. Energy consumption and independent variables will be
measured and collected and then fed to the MLR model and the NARX-ANN model. A baseline energy model
will be developed and energy prediction will be conducted. The two models will then be compared to
assess the accuracy of the baseline energy models developed using the MLR and NARX-ANN models. Certain
parts of the NARX-ANN algorithm follow [17], as this work is a continuation of that study. The research
ends with a conclusion that identifies the more accurate model.


2.1. Academic Buildings

The building chosen as the case study in this research is the Faculty of Electrical Engineering (FKE)
building. It is located at the Universiti Teknologi MARA (UiTM) Johor branch, Pasir Gudang campus. The
FKE building is shown in Figure 1. It consists of sixty lecturers' office rooms, five meeting rooms, four
classrooms and fourteen laboratories. The classrooms, laboratories and meeting rooms are served by a
centralized air conditioning system, whereas the lecturers' office rooms use split unit air
conditioning. The energy consumption will be measured during the March 2018 to July 2018 lecture
session, which consists of 14 lecture weeks. The energy consumption will be measured again during the
September 2018 to January 2019 lecture session for energy prediction comparison.

Figure 1. FKE’s building


2.2. Energy Consumption Measurement 

A Fluke 1750 data logger will be used to measure the energy consumption of the FKE building. The data
logger will be installed in the main switch room, which receives its electricity supply from the main
33/11 kV transformer after it has been stepped down at 11 kV/415 V. The data logger will be tapped at
the 415 V busbar with a 15-minute sampling interval. The energy consumption will then be aggregated to
30-minute intervals, and the baseline energy model will be developed at hourly intervals from the
30-minute aggregated data. This step is carried out to reduce the oscillation of the data, which may in
turn reduce the computational time [18].
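
The aggregation step can be illustrated with a short Python sketch. It assumes that each reading is the
energy (kWh) consumed within its interval, so coarser intervals are obtained by summation; if the logger
reports average power instead, the groups would be averaged. The helper `aggregate` and the sample
readings are hypothetical.

```python
def aggregate(readings, factor):
    """Sum consecutive groups of `factor` readings; an incomplete tail is dropped."""
    usable = len(readings) - len(readings) % factor
    return [sum(readings[i:i + factor]) for i in range(0, usable, factor)]


# Hypothetical 15-minute energy readings in kWh covering two hours.
kwh_15min = [5.0, 6.0, 7.0, 8.0, 4.0, 4.0, 3.0, 3.0]
kwh_30min = aggregate(kwh_15min, 2)    # 15-minute -> 30-minute intervals
kwh_hourly = aggregate(kwh_30min, 2)   # 30-minute -> hourly intervals
print(kwh_30min, kwh_hourly)
```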


2.3. Independent Variables 

There is no standard procedure for selecting the independent variables that may affect the energy
consumption. The selection may be made intuitively or by using algorithms that rank the independent
variables by their impact on energy consumption. Because such ranking algorithms add to model
complexity, the independent variables in this research are selected intuitively. The candidate
independent variables for the FKE building are staff occupancy, student occupancy in classrooms,
student occupancy in laboratories and outside temperature.

As listed in Table 1, staff occupancy in the FKE building will be counted from the daily attendance
record. The count is based on staff occupying their rooms during office hours from 8.00 am until 5.00 pm.
The office rooms are occupied by lecturers, and even when a lecturer has classes to attend, the room is
counted as occupied on the assumption that the loads in the room are not switched off. Student occupancy
in classrooms and laboratories is counted from the number of students registered for the current
semester's subjects, as available in the student timetable system. The outside temperature will be
collected from the nearest satellite and weather station available at www.weatherundergroud.com.
All independent variables will be collected during the March 2018 to July 2018 session.


2.4. Multiple Linear Regression Model

A linear regression model uses a linear equation as its model estimator and can be described as a purely
mathematical model. The most well-known regression model is the multiple linear regression (MLR) model.
The MLR model assumes that an output $Y$ depends linearly on its variables $X_1, X_2, X_3, \dots, X_n$.
The MLR model can be written as in (1).


$$ Y = \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n + \beta_0 + \varepsilon \tag{1} $$


In (1), $Y$ is the output of the function and $X_1, \dots, X_n$ are the variables that govern the output.
$\beta_1, \beta_2, \dots, \beta_n$ are the slopes of the equation, also known as the coefficients or
parameters of the variables $X$, $\beta_0$ is the intercept and $\varepsilon$ is the error term. The
linear regression model is widely accepted and adopted for prediction studies because of its simplicity.
In this research, the energy consumption measured and the independent variables collected during March
2018 will be the training data and the baseline period of the MLR model. The energy prediction will be
based on the baseline model developed from the March 2018 training data, and the remainder of the March
2018 to July 2018 data will be the testing data for energy consumption prediction.
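
As an illustration of how the coefficients of an MLR model can be estimated from training data and then
applied to unseen data, the following Python sketch solves the least-squares normal equations without
external libraries. It is not the tool used in this study; the function names `fit_mlr` and
`predict_mlr` and the sample data are hypothetical.

```python
def fit_mlr(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ... + bn*xn via the normal equations."""
    rows = [[1.0] + list(r) for r in X]           # prepend the intercept column
    k = len(rows[0])
    # Augmented system [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda idx: abs(A[idx][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]   # [b0, b1, ..., bn]


def predict_mlr(beta, x):
    """Evaluate the fitted linear model for one observation x."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


# Hypothetical training data generated from y = 2 + 3*x1 - x2.
X_train = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y_train = [3.0, 7.0, 7.0, 11.0, 12.0]
beta = fit_mlr(X_train, y_train)
y_test_pred = predict_mlr(beta, [6, 2])            # prediction for unseen test data
print(beta, y_test_pred)
```

In the setting of this paper, the training rows would correspond to the March 2018 data and the test
rows to the remainder of the March 2018 to July 2018 data.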


2.5. NARX-ANN Model

The Non-linear Auto Regressive with Exogenous Input (NARX) model is a non-linear time series model
derived from the linear autoregressive with exogenous input (ARX) model and has been widely used for
modelling and prediction [19-25]. The model consists of a structure that is repeated in a dynamic network
with feedback connections. The NARX model can be described as in (2).


$$ y(t) = f\big(y(t-1), y(t-2), \dots, y(t-n), u(t-1), u(t-2), \dots, u(t-n)\big) + \varepsilon \tag{2} $$


The output $y(t)$ is a function of the past outputs $y(t-1), y(t-2), \dots, y(t-n)$ and the past
exogenous inputs $u(t-1), u(t-2), \dots, u(t-n)$, approximated by the non-linear function $f$ in (2).
The past outputs and past inputs that enter the non-linear function are also known as the tapped delay
lines of the NARX model.
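
To make the role of the tapped delay lines concrete, the following Python sketch builds the lagged
input vectors of (2) from a measured output series and one exogenous input series. A single exogenous
input is used for brevity, whereas this study uses four. The function name `narx_regressors` and the
sample values are hypothetical.

```python
def narx_regressors(y, u, n):
    """Build (features, target) pairs for y(t) = f(y(t-1..t-n), u(t-1..t-n))."""
    features, targets = [], []
    for t in range(n, len(y)):
        past_outputs = [y[t - k] for k in range(1, n + 1)]   # y(t-1) ... y(t-n)
        past_inputs = [u[t - k] for k in range(1, n + 1)]    # u(t-1) ... u(t-n)
        features.append(past_outputs + past_inputs)
        targets.append(y[t])
    return features, targets


y = [10.0, 12.0, 15.0, 14.0, 13.0]   # hypothetical energy consumption
u = [28.0, 29.0, 31.0, 30.0, 29.0]   # hypothetical outside temperature
feats, targs = narx_regressors(y, u, n=2)
print(feats[0], targs[0])
```

Each feature vector is what the non-linear estimator receives as input, and the matching target is the
value it is trained to reproduce.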

The multilayer perceptron artificial neural network (MLP-ANN) is a model estimator widely used with the
NARX model, and the combination is called NARX-ANN [26]. The MLP-ANN consists of an input layer, a
hidden layer and an output layer. The MLP-ANN receives the independent inputs, which are connected to
the neurons in the hidden layer. Each input is multiplied by a weight, and a bias is added, in the
interconnection between the input layer and the hidden layer. The summation occurs in the neuron, where
the activation function $f(N)$ acts on the inputs $U$ multiplied by the weights $W$ plus the bias $b$,
as in (3).


$$ f(N) = W_1 U_1 + W_2 U_2 + W_3 U_3 + b \tag{3} $$
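
A minimal Python sketch of a single hidden neuron follows. It computes the weighted sum of (3) and then
passes it through an activation function. The hyperbolic tangent is an assumption made here for
illustration, since the activation function is not specified in the text; the function name
`neuron_output` and the numerical values are hypothetical.

```python
import math


def neuron_output(inputs, weights, bias, activation=math.tanh):
    """Weighted sum of the inputs plus bias, passed through an activation function."""
    net = sum(w * u for w, u in zip(weights, inputs)) + bias
    return activation(net)


U = [0.5, -1.0, 2.0]   # hypothetical inputs
W = [0.4, 0.3, 0.1]    # hypothetical weights
b = 0.1                # hypothetical bias
out = neuron_output(U, W, b)
print(out)
```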


The MLP-ANN connection is purely feedforward, which makes it a suitable model estimator for the NARX
model. The MLP-ANN will be trained using the Levenberg-Marquardt learning algorithm because of its fast
computational time [27, 28]. In this work, the measured energy consumption and the collected independent
variables will be divided into 70% training, 15% validation and 15% testing data [29] by the interleaved
method for modelling purposes. The number of neurons and tapped delay lines of the NARX-ANN model will
be determined by trial and error. The baseline energy model developed from the NARX-ANN model will be
used to perform energy prediction over a one-step-ahead horizon. The one-step-ahead prediction will be
compared with the March 2018 to July 2018 measured data.
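
The interleaved 70/15/15 division can be sketched as follows. The exact interleaving pattern used in
this study is not stated, so the scheme below is only one possible realization: each sample goes to the
subset that is furthest below its target share. The function name `interleaved_split` and the integer
weights (14, 3, 3), which encode the 70/15/15 ratio, are assumptions.

```python
def interleaved_split(n_samples, weights=(14, 3, 3)):
    """Cycle sample indices through train/validation/test in a 70/15/15 ratio."""
    total = sum(weights)
    subsets = ([], [], [])
    for i in range(n_samples):
        # Integer deficit of each subset relative to its target share so far.
        deficits = [w * (i + 1) - total * len(s) for w, s in zip(weights, subsets)]
        subsets[deficits.index(max(deficits))].append(i)
    return subsets


train_idx, val_idx, test_idx = interleaved_split(20)
print(train_idx, val_idx, test_idx)
```

Unlike a block split, this keeps validation and test samples spread across the whole measurement
period.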


2.6. Statistical Measurement 

To provide a fair comparison, the energy consumption predicted by both models using the March 2018 to
July 2018 data will also be compared with the energy consumption measured during the September 2018 to
January 2019 lecture session. The energy consumption measured during September 2018 to January 2019 will
not be used for training or testing of the MLR and NARX-ANN models in baseline energy model development
and prediction. The actual and predicted energy consumption will be compared by means of the Mean
Squared Error (MSE), Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE), given in
(4), (5) and (6) respectively. In (4) and (5), $Y_i$ is the predicted value and $y_i$ is the actual value.
In (6), $A_t$ is the actual value and $F_t$ is the forecasted value. In all three, $n$ is the number of
data points.


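$$ \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (Y_i - y_i)^2 \tag{4} $$

$$ \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (Y_i - y_i)^2} \tag{5} $$

$$ \mathrm{MAPE} = \frac{100}{n} \sum_{t=1}^{n} \left| \frac{A_t - F_t}{A_t} \right| \tag{6} $$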
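
The three error measures can be computed with a few lines of Python, shown here as a sketch with
hypothetical sample values. The MAPE function assumes that no actual value is zero.

```python
import math


def mse(predicted, actual):
    """Mean squared error between predicted and actual values."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean square error, in the same unit as the data."""
    return math.sqrt(mse(predicted, actual))


def mape(actual, forecast):
    """Mean absolute percentage error; actual values must be non-zero."""
    return 100.0 / len(actual) * sum(abs((a - f) / a) for a, f in zip(actual, forecast))


actual = [100.0, 200.0, 400.0]       # hypothetical measured values
predicted = [110.0, 180.0, 400.0]    # hypothetical predicted values
print(mse(predicted, actual), rmse(predicted, actual), mape(actual, predicted))
```

Because the RMSE is the square root of the MSE, it is the easier of the two to compare directly with the
measured consumption.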


3. RESULTS AND DISCUSSION 

The results and discussion of the proposed research methodology are separated into three sections. The
first and second sections present the results achieved with the MLR model and the NARX-ANN model
respectively, including the developed baseline energy model and the energy prediction for the FKE
building. The third section compares both models with the actual data measured during September 2018 to
January 2019.


3.1. MLR Model

All of the energy consumption measured in March 2018 for the FKE building was fed to the multiple linear
regression model, a total of 456 data points. These data points yield (7), which is the baseline energy
model for the FKE building.


$$ Y = 1.3664 X_1 + 0.0154 X_2 - 0.0024 X_3 + 3.0735 X_4 - 59.4566 \tag{7} $$


In (7), $X_1$, $X_2$, $X_3$ and $X_4$ are the independent variables for staff occupancy, student
occupancy in classrooms, student occupancy in laboratories and temperature respectively. The coefficients
are positive for staff occupancy, student occupancy in classrooms and temperature, whereas student
occupancy in laboratories has a negative coefficient. The regression statistics are shown in Table 2.
The correlation coefficient R reaches a high value of 0.89 for the FKE building. In addition, $R^2$,
which represents the goodness of fit of the model, has a high value of 0.80.
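
The fitted baseline model can be evaluated directly for any set of independent variables. The Python
sketch below uses the coefficients reported above; the function name `fke_baseline` and the input
values are hypothetical and chosen only to show the arithmetic.

```python
def fke_baseline(staff, students_class, students_lab, temperature):
    """Evaluate the fitted MLR baseline model for the FKE building."""
    return (1.3664 * staff + 0.0154 * students_class
            - 0.0024 * students_lab + 3.0735 * temperature - 59.4566)


# Hypothetical inputs: 40 staff, 100 classroom students, 50 laboratory students, 30 degrees outside.
estimate = fke_baseline(staff=40, students_class=100, students_lab=50, temperature=30.0)
print(round(estimate, 4))
```

With these inputs, the temperature and staff occupancy terms dominate the estimate, while the two
student occupancy terms contribute very little.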

The remaining independent variable data for the FKE building are then inserted into the baseline energy
model. The energy consumption predicted from the remaining data is shown in Figure 2, where the red
dashed line is the predicted energy consumption and the bold blue line is the measured energy
consumption. Figure 2 shows a large deviation at almost all of the higher peaks of the energy
consumption. A similar pattern is visible at the lower peaks, where there is some deviation between the
predicted and measured values. The multiple linear regression model therefore performs well in terms of
the correlation coefficient (R) and the coefficient of determination ($R^2$), but at the prediction
level the values at the high and low peaks deviate from the measurements, which may reduce accuracy.
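
For reference, the correlation coefficient between two series can be computed as in the Python sketch
below. The function name `pearson_r` and the sample values are hypothetical. For a least-squares
regression with an intercept, $R^2$ equals the square of the multiple R, so the reported values of 0.89
and 0.80 are consistent up to rounding.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


measured = [2.0, 4.0, 5.0, 9.0]     # hypothetical measured values
predicted = [1.0, 2.0, 3.0, 4.0]    # hypothetical predicted values
r = pearson_r(predicted, measured)
print(r, r ** 2)
```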


Table 2. FKE’s Building Regression Statistics

Regression Statistics    FKE
Multiple R               0.89
R Square ($R^2$)         0.80
Standard Error           16.831

Figure 2. FKE’s building predicted energy consumption


3.2. NARX-ANN Model 

The NARX-ANN modelling starts with initializing the number of neurons and the tapped delay lines of the
model. Several combinations of neuron numbers and tapped delay lines were tested by trial and error.
The number of neurons and the input and output tapped delay lines that produced the best result are
shown in Table 3. With 50 neurons and 20 input and output tapped delay lines, the model for the FKE
building achieves a correlation value of 0.98. This is higher than the correlation value of the MLR
model (0.89), indicating that the NARX-ANN model has captured a stronger relationship between the
independent variables and the energy consumption. The energy consumption predicted one step ahead by
the NARX-ANN model is plotted in Figure 3. In Figure 3, the one-step-ahead predicted energy consumption
closely follows the measured (actual) energy consumption. This further supports that the NARX-ANN model
is capable of modelling the non-linear behavior that exists between the independent variables and the
energy consumption. From visual inspection, the agreement is better than that of the MLR model
prediction.
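
One-step-ahead prediction means that every prediction is made from measured past values rather than
from earlier predictions. The Python sketch below illustrates this with a stand-in model; the function
name `one_step_ahead`, the `persistence` model and the sample data are hypothetical, and a trained
NARX-ANN network would take the place of the stand-in.

```python
def one_step_ahead(model, y_measured, u, n):
    """Predict y(t) from measured y(t-1..t-n) and u(t-1..t-n) for every t >= n."""
    predictions = []
    for t in range(n, len(y_measured)):
        lags = [y_measured[t - k] for k in range(1, n + 1)]
        lags += [u[t - k] for k in range(1, n + 1)]
        predictions.append(model(lags))
    return predictions


def persistence(lags):
    """Stand-in for a trained network: returns the most recent measured output."""
    return lags[0]


y_meas = [10.0, 12.0, 15.0, 14.0, 13.0]   # hypothetical energy consumption
u_temp = [28.0, 29.0, 31.0, 30.0, 29.0]   # hypothetical outside temperature
preds = one_step_ahead(persistence, y_meas, u_temp, n=2)
print(preds)
```

Because measured values are fed back at every step, prediction errors do not accumulate over the
horizon.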


Table 3. FKE’s Building NARX-ANN Parameters

Parameters                     FKE
Input Tapped Delay Lines       20
Output Tapped Delay Lines      20
Neuron                         50
Correlation Coefficient (R)    0.98








Figure 3. FKE’s building predicted energy consumption 


3.3. Comparison between MLR Model and NARX-ANN Model 

The MSE, RMSE and MAPE of the energy consumption predicted by the MLR and NARX-ANN models are shown in
Table 4. The values in Table 4 are calculated from the predictions in Figures 2 and 3, which cover the
energy consumption during March 2018 to July 2018. The RMSE is in kWh and the MSE is in kWh squared.
The MLR model has higher MSE and RMSE values than the NARX-ANN model (234.43 and 15.31 against 37.16
and 6.09), which strongly indicates that the energy consumption predicted by the NARX-ANN model has a
lower error than that of the MLR model. The MAPE of the NARX-ANN model (0.08) is also lower than that of
the MLR model (0.32). Because the MAPE expresses the deviation between the measured and predicted energy
consumption as a percentage error, the lower MAPE indicates that the NARX-ANN model is more accurate in
modelling and prediction. A small MAPE shows that the percentage error between the actual and predicted
values is small and that the model is reliable for prediction.

To provide further evidence of the accuracy and reliability of the NARX-ANN model, the energy
consumption predicted using the March 2018 to July 2018 data is compared with the energy consumption
measured during September 2018 to January 2019. The energy consumption measured during September 2018
to January 2019 is not involved in any modelling or prediction. The comparison is shown in Figure 4.

In Figure 4, the MLR model displays the same behavior as in the March 2018 to July 2018 prediction,
where some of the high and low peak values deviate from the measured consumption. The NARX-ANN model
also shows some weaknesses in this comparison, with outliers at some of the high and low peak values.
The calculated MSE, RMSE and MAPE values are shown in Table 5. The lower MSE, RMSE and MAPE values
indicate that the NARX-ANN model performs better than the MLR model. The NARX-ANN model is still able
to relate the energy consumption to the respective independent variables, as shown by its lower MAPE
(0.22 against 0.27 for the MLR model).


Table 4. Energy Prediction March 2018 to July 2018 Error Comparison

Model                              MSE       RMSE     MAPE
MLR Model                          234.43    15.31    0.32
NARX-ANN Model (one step ahead)    37.16     6.09     0.08

Figure 4. FKE’s building predicted energy consumption compared with September 2018 to January 2019 measured energy consumption


Table 5. Error Comparison of Energy Prediction with September 2018 to January 2019 Measured Energy
Consumption

Model                              MSE       RMSE     MAPE
MLR Model                          234.43    15.59    0.27
NARX-ANN Model (one step ahead)    162.66    12.75    0.22





4. CONCLUSION 

This paper has presented the development of a baseline energy model and the prediction of energy
consumption using linear and non-linear modelling techniques. An academic building in Malaysia during
lecture weeks was used as the case study. The MLR model, as a linear model, and the NARX-ANN model, as a
non-linear model, were used for modelling and prediction. The NARX-ANN model gives more satisfactory
results than the MLR model, with lower MSE, RMSE and MAPE values. In addition, the energy consumption
predicted by the NARX-ANN model closely resembles the measured energy consumption. Although the
NARX-ANN model has some drawbacks, such as outliers at some of the peak values, it performs relatively
well in meeting the objective of this research. Thus, from the results obtained, the NARX-ANN model as a
non-linear model provides higher accuracy than the MLR model as a linear model in developing the
baseline energy model and performing energy prediction.